Using Few Clues Can Compensate the Small Amount of Resources Available for Word Sense Disambiguation

Authors

  • Claude de Loupy
  • Marc El-Bèze
Abstract

Word Sense Disambiguation (WSD) is considered one of the most difficult tasks in Natural Language Processing (NLP). Probabilistic methods have proven effective in many NLP tasks, but they require a training phase, and very few resources are available for WSD. This paper aims to show how to make the most of size-limited resources in order to partially overcome the knowledge acquisition bottleneck. Experiments are performed within the SENSEVAL test framework to evaluate the advantage of a lemmatized or stemmed context over the original context (inflected forms as observed in the raw text). We then measure the precision improvement (about 6%) obtained by looking at the inflected form of the word to be disambiguated. Lastly, we show that it is possible to reduce the ambiguity when the word to be disambiguated has a particular inflected form or occurs as part of a compound.
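
The contrast between an original and a normalized context, and the use of the target word's inflected form as an extra clue, can be illustrated with a minimal sketch. This is not the authors' system: the function names `toy_stem` and `context_features`, the feature prefixes `ctx=` and `form=`, and the crude suffix-stripping rule are all hypothetical choices made for illustration, and a real experiment would use a proper lemmatizer or stemmer and a trained probabilistic classifier.

```python
from collections import Counter


def toy_stem(token: str) -> str:
    """Crude suffix stripper (hypothetical stand-in for a real stemmer or lemmatizer)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def context_features(tokens, target_index, normalize=False, use_target_form=False):
    """Count context words around the target word.

    normalize=True       -> context words are stemmed (normalized context)
    normalize=False      -> context words are kept as observed (original context)
    use_target_form=True -> the inflected form of the target word is added as a clue
    """
    features = Counter()
    for i, tok in enumerate(tokens):
        if i == target_index:
            continue  # the target itself is not part of its own context
        word = tok.lower()
        features["ctx=" + (toy_stem(word) if normalize else word)] += 1
    if use_target_form:
        features["form=" + tokens[target_index].lower()] += 1
    return features


# Assumed example sentence; the target word is "banks" (index 1).
tokens = "The banks raised their rates".split()
raw = context_features(tokens, 1)
stemmed = context_features(tokens, 1, normalize=True, use_target_form=True)
```

Here `raw` keeps the inflected context words as observed, while `stemmed` merges inflected variants under one key and also records that the target occurred in its plural form, which is the kind of clue the abstract reports as improving precision.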

Similar resources

Word Sense Disambiguation Using Neural Networks with Concept Co-occurrence Information

Most previous word sense disambiguation approaches based on neural networks were impractical due to their huge feature set size. We propose a method for resolving word sense ambiguity using neural networks with refined concept co-occurrence information (CCI) as features. Using CCI refinement processing, we reduce the number of features of the network to a practical size. We also show that word ...

Principled Disambiguation: Discriminating Adjective Senses with Modified Nouns

Recent corpus-based work on word sense disambiguation explores the application of statistical pattern recognition procedures to lexical co-occurrence data from very large text databases. In this paper we argue for a linguistically principled approach to disambiguation, in which relevant contextual clues are narrowly defined, in syntactic and semantic terms, and in which only highly reliable clu...

Can multilinguality improve Biomedical Word Sense Disambiguation?

Ambiguity in the biomedical domain represents a major issue when performing Natural Language Processing tasks over the huge amount of available information in the field. For this reason, Word Sense Disambiguation is critical for achieving accurate systems able to tackle complex tasks such as information extraction, summarization or document classification. In this work we explore whether multil...

Extracting Ontological Relations of Korean Numeral Classifiers from Semi-structured Resources Using NLP Techniques

Many studies have focused on the facts that numeral classifiers give decisive clues to the semantic categorizing of nouns. However, few studies have analyzed the ontological relationships of classifiers or the construction of classifier ontology. In this paper, a semi-automatic method of extracting and representing the various ontological relations of Korean numeral classifiers is proposed. Sha...

Towards Cross-Language Word Sense Disambiguation for Quechua

In this paper we present initial work on cross-language word sense disambiguation for translating adjectives from Spanish to Quechua and situate CLWSD as part of the translation task. While there are many available resources for training Spanish-language NLP systems, linguistic resources for Quechua, especially Spanish-Quechua bitext, are quite limited, so some ingenuity is required in developi...

Publication date: 2000